Doing Good or Doing Right? Exploring the Weakness of Commonsense Causal Reasoning Models. (arXiv:2107.01791v1 [cs.CL])
Pretrained language models (PLMs) achieve surprising performance on the Choice
of Plausible Alternatives (COPA) task. However, whether PLMs have truly
acquired the ability of causal reasoning remains an open question. In this
paper, we investigate the problem of semantic similarity bias and reveal the
vulnerability of current COPA models to certain attacks. Previous solutions
that tackle the superficial cues of unbalanced token distributions still
suffer from this semantic bias, and even more severely, owing to their use of
additional training data. We mitigate this problem by simply adding a
regularization loss, and experimental results show that this solution not only
improves the model's generalization ability, but also helps the models perform
more robustly on a challenging dataset, BCOPA-CE, which has an unbiased token
distribution and makes it more difficult for models to distinguish cause from
effect.
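
Below is a minimal sketch of the general recipe the abstract describes: training a PLM-based COPA classifier with the standard cross-entropy objective plus an extra regularization term weighted by a coefficient. The abstract does not specify the form of the paper's regularization loss, so the divergence-based penalty, the model choice, the batch field names, and the weight `lam` here are all illustrative assumptions, not the authors' actual method.

```python
# Illustrative sketch only: adds a regularization term to the standard
# multiple-choice cross-entropy loss for COPA-style training.
# The regularizer below (penalizing agreement with premise-free predictions)
# is a placeholder assumption, not the loss proposed in the paper.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelForMultipleChoice

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMultipleChoice.from_pretrained("roberta-base")

def placeholder_regularizer(logits_with_premise, logits_without_premise):
    """Hypothetical anti-bias term: discourage the model from making the
    same prediction whether or not the premise is visible, i.e. from
    relying only on alternative-side (semantic similarity) cues."""
    log_p_with = F.log_softmax(logits_with_premise, dim=-1)
    p_without = F.softmax(logits_without_premise, dim=-1)
    # Negative KL divergence: the loss grows when the two distributions agree.
    return -F.kl_div(log_p_with, p_without, reduction="batchmean")

def training_step(batch, lam=0.1):
    # Standard scoring of (premise, question, alternative) inputs;
    # tensors are assumed to be shaped (batch, num_choices, seq_len).
    out_full = model(input_ids=batch["full_ids"],
                     attention_mask=batch["full_mask"],
                     labels=batch["labels"])
    # Same alternatives scored with the premise removed/masked.
    out_alt = model(input_ids=batch["alt_only_ids"],
                    attention_mask=batch["alt_only_mask"])
    reg = placeholder_regularizer(out_full.logits, out_alt.logits)
    # Total objective: cross-entropy plus the weighted regularization term.
    return out_full.loss + lam * reg
```

The key design point, consistent with the abstract, is that the fix is a single additional loss term rather than extra training data; any concrete regularizer would be substituted for `placeholder_regularizer` above.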